基于激光雷达的3D对象检测,语义分割和全景分段通常在具有独特架构的专业网络中实现,这些网络很难相互适应。本文介绍了Lidarmultinet,这是一个基于激光雷达的多任务网络,该网络统一了这三个主要的激光感知任务。在其许多好处中,多任务网络可以通过在多个任务中分享权重和计算来降低总成本。但是,与独立组合的单任务模型相比,它通常表现不佳。拟议的Lidarmultinet旨在弥合多任务网络和多个单任务网络之间的性能差距。 Lidarmultinet的核心是一个强大的基于3D Voxel的编码器架构,具有全局上下文池(GCP)模块,从激光雷达框架中提取全局上下文特征。特定于任务的头部添加在网络之上,以执行三个激光雷达感知任务。只需添加新的任务特定的头部,可以在引入几乎没有额外成本的同时,就可以实现更多任务。还提出了第二阶段来完善第一阶段的分割并生成准确的全景分割结果。 Lidarmultinet在Waymo Open数据集和Nuscenes数据集上进行了广泛的测试,这首先证明了主要的激光雷达感知任务可以统一在单个强大的网络中,该网络是经过训练的端到端,并实现了最先进的性能。值得注意的是,Lidarmultinet在Waymo Open数据集3D语义分割挑战2022中达到了最高的MIOU和最佳准确性,对于测试集中的22个类中的大多数,仅使用LIDAR点作为输入。它还为Waymo 3D对象检测基准和三个Nuscenes基准测试的单个模型设置了新的最新模型。
translated by 谷歌翻译
基于查询的变压器在许多图像域任务中构建长期注意力方面表现出了巨大的潜力,但是由于点云数据的压倒性大小,在基于激光雷达的3D对象检测中很少考虑。在本文中,我们提出了CenterFormer,这是一个基于中心的变压器网络,用于3D对象检测。 CenterFormer首先使用中心热图在基于标准的Voxel点云编码器之上选择中心候选者。然后,它将中心候选者的功能用作变压器中的查询嵌入。为了进一步从多个帧中汇总功能,我们通过交叉注意设计一种方法来融合功能。最后,添加回归头以预测输出中心功能表示形式上的边界框。我们的设计降低了变压器结构的收敛难度和计算复杂性。结果表明,与无锚对象检测网络的强基线相比,有了显着改善。 CenterFormer在Waymo Open数据集上实现了单个模型的最新性能,验证集的MAPH为73.7%,测试集的MAPH上有75.6%的MAPH,大大优于所有先前发布的CNN和基于变压器的方法。我们的代码可在https://github.com/tusimple/centerformer上公开获取
translated by 谷歌翻译
该技术报告介绍了Waymo打开数据集3D语义分割挑战2022的第一名获胜解决方案。我们的网络称为Lidarmultinet,统一了单个框架中的3D语义细分,对象检测和泛型分割等主要激光镜感知任务。 Lidarmultinet的核心是一个强大的基于3D Voxel的编码器网络,具有新型的全局上下文池(GCP)模块,从激光雷达框架中提取全局上下文特征,以补充其本地功能。提出了一个可选的第二阶段,以完善第一阶段的分割或生成准确的全景分割结果。我们的解决方案达到了71.13的MIOU,对于Waymo 3D语义细分测试集的22个类中的大多数是最好的,它的表现优于官方排行榜上所有其他3D语义分段方法。我们首次证明,可以在可以端对端训练的单个强大网络中统一重大激光感知任务。
translated by 谷歌翻译
Bipedal robots have received much attention because of the variety of motion maneuvers that they can produce, and the many applications they have in various areas including rehabilitation. One of these motion maneuvers is walking. In this study, we presented a framework for the trajectory optimization of a 5-link (planar) Biped Robot using hybrid optimization. The walking is modeled with two phases of single-stance (support) phase and the collision phase. The dynamic equations of the robot in each phase are extracted by the Lagrange method. It is assumed that the robot heel strike to the ground is full plastic. The gait is optimized with a method called hybrid optimization. The objective function of this problem is considered to be the integral of torque-squared along the trajectory, and also various constraints such as zero dynamics are satisfied without any approximation. Furthermore, in a new framework, there is presented a constraint called impact invariance, which ensures the periodicity of the time-varying trajectories. On the other hand, other constraints provide better and more human-like movement.
translated by 谷歌翻译
The importance of humanoid robots in today's world is undeniable, one of the most important features of humanoid robots is the ability to maneuver in environments such as stairs that other robots can not easily cross. A suitable algorithm to generate the path for the bipedal robot to climb is very important. In this paper, an optimization-based method to generate an optimal stairway for under-actuated bipedal robots without an ankle actuator is presented. The generated paths are based on zero and non-zero dynamics of the problem, and according to the satisfaction of the zero dynamics constraint in the problem, tracking the path is possible, in other words, the problem can be dynamically feasible. The optimization method used in the problem is a gradient-based method that has a suitable number of function evaluations for computational processing. This method can also be utilized to go down the stairs.
translated by 谷歌翻译
Finding and localizing the conceptual changes in two scenes in terms of the presence or removal of objects in two images belonging to the same scene at different times in special care applications is of great significance. This is mainly due to the fact that addition or removal of important objects for some environments can be harmful. As a result, there is a need to design a program that locates these differences using machine vision. The most important challenge of this problem is the change in lighting conditions and the presence of shadows in the scene. Therefore, the proposed methods must be resistant to these challenges. In this article, a method based on deep convolutional neural networks using transfer learning is introduced, which is trained with an intelligent data synthesis process. The results of this method are tested and presented on the dataset provided for this purpose. It is shown that the presented method is more efficient than other methods and can be used in a variety of real industrial environments.
translated by 谷歌翻译
This paper proposes a perception and path planning pipeline for autonomous racing in an unknown bounded course. The pipeline was initially created for the 2021 evGrandPrix autonomous division and was further improved for the 2022 event, both of which resulting in first place finishes. Using a simple LiDAR-based perception pipeline feeding into an occupancy grid based expansion algorithm, we determine a goal point to drive. This pipeline successfully achieved reliable and consistent laps in addition with occupancy grid algorithm to know the ways around a cone-defined track with an averaging speeds of 6.85 m/s over a distance 434.2 meters for a total lap time of 63.4 seconds.
translated by 谷歌翻译
Convolutional Neural Networks (CNN) have shown promising results for displacement estimation in UltraSound Elastography (USE). Many modifications have been proposed to improve the displacement estimation of CNNs for USE in the axial direction. However, the lateral strain, which is essential in several downstream tasks such as the inverse problem of elasticity imaging, remains a challenge. The lateral strain estimation is complicated since the motion and the sampling frequency in this direction are substantially lower than the axial one, and a lack of carrier signal in this direction. In computer vision applications, the axial and the lateral motions are independent. In contrast, the tissue motion pattern in USE is governed by laws of physics which link the axial and lateral displacements. In this paper, inspired by Hooke's law, we first propose Physically Inspired ConsTraint for Unsupervised Regularized Elastography (PICTURE), where we impose a constraint on the Effective Poisson's ratio (EPR) to improve the lateral strain estimation. In the next step, we propose self-supervised PICTURE (sPICTURE) to further enhance the strain image estimation. Extensive experiments on simulation, experimental phantom and in vivo data demonstrate that the proposed methods estimate accurate axial and lateral strain maps.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
This paper proposes embedded Gaussian Process Barrier States (GP-BaS), a methodology to safely control unmodeled dynamics of nonlinear system using Bayesian learning. Gaussian Processes (GPs) are used to model the dynamics of the safety-critical system, which is subsequently used in the GP-BaS model. We derive the barrier state dynamics utilizing the GP posterior, which is used to construct a safety embedded Gaussian process dynamical model (GPDM). We show that the safety-critical system can be controlled to remain inside the safe region as long as we can design a controller that renders the BaS-GPDM's trajectories bounded (or asymptotically stable). The proposed approach overcomes various limitations in early attempts at combining GPs with barrier functions due to the abstention of restrictive assumptions such as linearity of the system with respect to control, relative degree of the constraints and number or nature of constraints. This work is implemented on various examples for trajectory optimization and control including optimal stabilization of unstable linear system and safe trajectory optimization of a Dubins vehicle navigating through an obstacle course and on a quadrotor in an obstacle avoidance task using GP differentiable dynamic programming (GP-DDP). The proposed framework is capable of maintaining safe optimization and control of unmodeled dynamics and is purely data driven.
translated by 谷歌翻译